batch-cluster
Support external batch-mode tools within Node.js.
Many command line tools, like ExifTool and GraphicsMagick, support running in a
"batch mode" that accepts commands through stdin and returns results through
stdout. As these tools can be fairly large, spinning them up can be expensive
(especially on Windows).
This module expedites these commands, or "Tasks," by managing a cluster of these
"batch" processes, feeding tasks to idle processes, retrying tasks when the tool
crashes, and preventing memory leaks by restarting processes after they have
performed a given number of tasks or after a given amount of time has elapsed.
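As a rough illustration of the recycling rule described above, here is a minimal sketch. The names `RecyclePolicy`, `maxTasks`, `maxAgeMillis`, and `shouldRecycle` are hypothetical and are not taken from this package's API.

```typescript
// Hypothetical sketch: decide whether a long-lived child process should be
// replaced. None of these names come from batch-cluster itself.
interface RecyclePolicy {
  maxTasks: number; // recycle after this many completed tasks
  maxAgeMillis: number; // recycle after this much wall-clock time
}

function shouldRecycle(
  tasksRun: number,
  startedAtMillis: number,
  nowMillis: number,
  policy: RecyclePolicy
): boolean {
  const tooManyTasks = tasksRun >= policy.maxTasks;
  const tooOld = nowMillis - startedAtMillis >= policy.maxAgeMillis;
  return tooManyTasks || tooOld;
}

// Example values only; pick limits that suit your tool.
const examplePolicy: RecyclePolicy = { maxTasks: 500, maxAgeMillis: 5 * 60 * 1000 };
```

Checking both limits covers a tool that leaks per task as well as one that degrades over time.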
This package powers
exiftool-vendored, whose
source you can examine as an example consumer.
Installation
Depending on your yarn/npm preference:
$ yarn add batch-cluster
$ npm install --save batch-cluster
Usage
The child process must use stdin and stdout for control/response.
BatchCluster will ensure a given process is only given one task at a time.
If these links are broken, use https://batch-cluster.js.org/
- Create a singleton instance of BatchCluster. Note that the constructor's
options parameter takes a union of several option types.
- The default logger writes warning and error messages to console.warn and
console.error. You can change this to your logger by using setLogger.
- Implement the Parser class to parse results from your child process.
- Construct a Task with the desired command and the parser you built in the
previous step, and submit it to your BatchCluster singleton's enqueueTask method.
See src/test.ts for an example child process. Note that the script is designed
to be flaky in order to test BatchCluster's retry and error handling code.
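The flow of the steps above (parser, task, enqueue) can be sketched with self-contained stand-ins. Everything here is an assumption for illustration: `TinyCluster`, `FakeProcess`, the `PASS` marker, and the exact shapes of `Parser` and `Task` are hypothetical and are not this package's real classes. Consult the API docs for the actual signatures.

```typescript
// Hypothetical stand-ins that mirror the steps above; NOT batch-cluster's API.

/** Converts the raw stdout text of one task into a typed result. */
type Parser<T> = (stdout: string) => T;

/** A command plus the parser that interprets its response. */
class Task<T> {
  readonly command: string;
  readonly parser: Parser<T>;
  constructor(command: string, parser: Parser<T>) {
    this.command = command;
    this.parser = parser;
  }
}

/** Stand-in for a batch-mode child process: one command in, one response out. */
type FakeProcess = (command: string) => Promise<string>;

/** Minimal queue that gives a single process only one task at a time. */
class TinyCluster {
  private tail: Promise<unknown> = Promise.resolve();
  private readonly proc: FakeProcess;
  constructor(proc: FakeProcess) {
    this.proc = proc;
  }

  enqueueTask<T>(task: Task<T>): Promise<T> {
    const run = this.tail.then(async () => task.parser(await this.proc(task.command)));
    // Keep the queue moving even if this task rejects.
    this.tail = run.catch(() => undefined);
    return run;
  }
}

// A fake tool that upcases its input and ends its response with "PASS"
// (an assumed success marker, chosen only for this example).
const upcaseTool: FakeProcess = async (command) => command.toUpperCase() + "\nPASS";

const parseFirstLine: Parser<string> = (stdout) => {
  const lines = stdout.split("\n");
  if (lines[lines.length - 1] !== "PASS") throw new Error("task failed: " + stdout);
  return lines[0] ?? "";
};

const cluster = new TinyCluster(upcaseTool);
const resultPromise = cluster.enqueueTask(new Task("hello", parseFirstLine));
```

The parser is the place to reject malformed output: throwing there rejects only that task's promise, and the queue continues with the next task.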
Versioning
The MAJOR or API version is incremented for
- 💔 Non-backwards-compatible API changes
The MINOR or UPDATE version is incremented for
- ✨ Backwards-compatible features
The PATCH version is incremented for
- 🐞 Backwards-compatible bug fixes
- 📦 Minor packaging changes
Changelog
v2.2.0
- 🐞 Windows taskkill /PID option seemed to work downcased, but the docs say
to use uppercase, so I've updated it.
- 📦 Upgrade all deps including TypeScript to 2.9
(v2.1.2 is the same contents, but np had a crash bug during publish)
v2.1.1
- 📦 More robust end for BatchProcess, which may prevent very long-lived
consumers from sporadically leaking child processes on Mac and Linux.
- 📦 Added Node 10 to the build matrix.
v2.1.0
- 📦 Introduced Logger.trace and moved logging related to per-task items down
to trace, as heavy load and large request or response payloads could
overwhelm loggers. If you really want to see on-the-wire requests and results,
enable trace in your logger implementation. By default, the ConsoleLogger
omits log messages with this level.
v2.0.0
- 💔 Replaced BatchClusterObserver with a simple EventEmitter API on
BatchCluster to be more idiomatic with node's API
- 💔 v1.11.0 added "process reuse" after errors, but that turned out to be
problematic in recovery, so that change was reverted (and with it, the
maxTaskErrorsPerProcess parameter was removed)
- ✨ Rate is simpler and more accurate now.
v1.11.0
- ✨ Added new BatchClusterObserver for error and lifecycle monitoring
- 📦 Added a number of additional logging calls
v1.10.0
v1.9.1
- 📦 Changed BatchProcess.end() to use until() rather than Promise.race,
and always use kill(pid, forced) after waiting the shutdown grace period
to prevent child process leaks.
v1.9.0
- ✨ New Logger.setLogger() for debug, info, warning, and errors. debug and
info default to Node's debuglog, warn and error default to console.warn and
console.error, respectively.
- 📦 docs generated by typedoc
- 📦 Upgraded dependencies (including TypeScript 2.7, which has more strict
verifications)
- 📦 Removed tslint, as tsc provides good lint coverage now
- 📦 The code is now prettier
- 🐞 delay now allows unrefing the timer. Previously, in certain circumstances,
the timer could prevent node processes from exiting gracefully until their
timeouts expired
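To show what unrefing a timer means in practice, here is a minimal sketch of a promise-based delay. The signature is an assumption for illustration and may not match this package's own delay helper.

```typescript
import { setTimeout } from "node:timers";

/**
 * Resolves after `millis`. When `unref` is true, the pending timer does not
 * keep the Node process alive by itself.
 * Hypothetical signature; not necessarily the one batch-cluster uses.
 */
function delay(millis: number, unref = false): Promise<void> {
  return new Promise<void>((resolve) => {
    const timer = setTimeout(resolve, millis);
    if (unref) timer.unref();
  });
}
```

Note that if an unref'd timer is the only pending work, Node may exit before the promise resolves, which is the desired behavior for background housekeeping timers.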
v1.8.0
- ✨ onIdle now runs as many tasks as it can, rather than just one. This should
provide higher throughput.
- 🐞 Removed stderr emit on race condition between onIdle and execTask. The
error condition was already handled appropriately--no need to console.error.
v1.7.0
- 📦 Exported kill() and running() from BatchProcess
v1.6.1
- 📦 De-flaked some tests on mac, and added Node 8 to the build matrix.
v1.6.0
- ✨ Processes are forcefully shut down with taskkill on Windows and kill -9
on unix-like platforms if they don't terminate after sending the
exitCommand, closing stdin, and sending the proc a SIGTERM. Added a test
harness to exercise this.
- 📦 Upgrade to TypeScript 2.6.1
- 🐞 mocha tests don't require the --exit hack anymore 🎉
v1.5.0
- ✨ .running() works correctly for PIDs with different owners now.
- 📦 yarn upgrade --latest
v1.4.2
- 📦 Ran code through prettier and delinted
- 📦 Massaged test assertions to pass through slower CI systems
v1.4.1
- 📦 Replaced an errant console.log with a call to log.
v1.4.0
- 🐞 Discovered maxProcs wasn't always utilized by onIdle, which meant in
certain circumstances, only 1 child process would be servicing pending
requests. Added breaking tests and fixed impl.
v1.3.0
- 📦 Added tests to verify that the kill(0) calls, which check that child
processes are still running, work across different node versions and OSes
- 📦 Removed unused methods in BatchProcess (whose API should not be accessed
directly by consumers, so the major version remains at 1)
- 📦 Switched to yarn and upgraded dependencies
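For readers unfamiliar with the signal-0 technique mentioned above, a minimal sketch follows. It uses Node's process.kill with signal 0, which tests for a process's existence without terminating it. The function body is an illustration and is not copied from this package's source.

```typescript
/**
 * Returns true if a process with the given pid appears to exist.
 * Sketch only; batch-cluster's own running() may differ.
 */
function running(pid: number): boolean {
  try {
    // Signal 0 performs the existence and permission checks without
    // delivering a signal.
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but is owned by another user.
    return (err as { code?: string }).code === "EPERM";
  }
}
```

Treating EPERM as "running" matters when the checked process belongs to a different owner.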
v1.2.0
- ✨ Added a configurable cleanup signal to ensure child processes shut down on
.end()
- 📦 Moved child process management from BatchCluster to BatchProcess
- ✨ More test coverage around batch process concurrency, reuse, flaky task
retries, and proper process shutdown
v1.1.0
- ✨ BatchCluster now has a force-shutdown exit handler to accompany the
graceful-shutdown beforeExit handler. For reference, from the Node docs:
The 'beforeExit' event is not emitted for conditions causing explicit
termination, such as calling process.exit() or uncaught exceptions.
- ✨ Remove Rate's time decay in the interests of simplicity
v1.0.0
- ✨ Integration tests now throw deterministically random errors to simulate
flaky child procs, and ensure retries and disaster recovery work as expected.
- ✨ If the processFactory or versionCommand fails more often than a given
rate, BatchCluster will shut down and raise exceptions to subsequent
enqueueTask callers, rather than try forever to spin up processes that are
most likely misconfigured.
- ✨ Given the proliferation of construction options, those options are now
sanity-checked at construction time, and an error will be raised whose message
contains all incorrect option values.
v0.0.2
- ✨ Added support and explicit tests for CR LF, CR, and LF encoded streams
from spawned processes
- ✨ child processes are ended after maxProcAgeMillis, and restarted as needed
- 🐞 BatchCluster now practices good listener hygiene for process.beforeExit
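Handling all three line-ending styles can be illustrated with a single regular expression. This is a minimal sketch, and `splitLines` is a hypothetical helper name rather than a function from this package.

```typescript
/**
 * Splits text on CR LF, lone CR, or lone LF.
 * The alternation tries "\r\n" first so a CR LF pair counts as one break.
 */
function splitLines(data: string): string[] {
  return data.split(/\r\n|\r|\n/);
}

const exampleLines = splitLines("a\r\nb\rc\nd");
```

When reading from a stream, a chunk can end between the CR and the LF of a pair, so a real implementation would need to buffer a trailing CR until the next chunk arrives.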
v0.0.1